Latency Considerations of Depth-first GPU Ray Tracing

Author

  • Michael Guthe
Abstract

Despite the potential divergence of depth-first ray tracing [AL09], it is nevertheless the most efficient approach on massively parallel graphics processors. Due to the use of specialized caching strategies that were originally developed for texture access, it has been shown to be compute limited rather than bandwidth limited. Especially with recent developments, however, not only the raw bandwidth but also the latency of both memory accesses and read-after-write register dependencies can become a limiting factor. In this paper, we analyze the memory and instruction dependency latencies of depth-first ray tracing. We show that ray tracing is in fact latency limited on current GPUs and propose three simple strategies to better hide these latencies. This way, we come significantly closer to the maximum performance of the GPU.
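
For readers unfamiliar with the term, the following is a minimal sketch of depth-first traversal of a bounding volume hierarchy, in which each ray walks the tree with its own small stack. It is an illustration only and not the paper's GPU implementation: the names `Node`, `hits_box`, `traverse`, and `DEMO_NODES` are hypothetical, leaves hold a bare primitive id instead of triangles, and children are visited in a fixed order rather than nearest first.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    lo: Vec3                      # minimum corner of the bounding box
    hi: Vec3                      # maximum corner of the bounding box
    left: Optional[int] = None    # index of left child (inner nodes only)
    right: Optional[int] = None   # index of right child (inner nodes only)
    prim: Optional[int] = None    # primitive id (leaf nodes only)


def hits_box(origin: Vec3, direction: Vec3, lo: Vec3, hi: Vec3) -> bool:
    """Slab test: True if the ray overlaps the box for some t >= 0."""
    t_near, t_far = 0.0, float("inf")
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < a or o > b:
                return False
            continue
        t0, t1 = (a - o) / d, (b - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
    return t_near <= t_far


def traverse(nodes: List[Node], origin: Vec3, direction: Vec3) -> List[int]:
    """Depth-first traversal with an explicit stack; node 0 is the root."""
    found = []
    stack = [0]
    while stack:
        node = nodes[stack.pop()]
        if not hits_box(origin, direction, node.lo, node.hi):
            continue              # prune the whole subtree
        if node.prim is not None:
            found.append(node.prim)
        else:
            stack.append(node.right)
            stack.append(node.left)   # pushed last, so visited first
    return found


# Hypothetical demo tree: three unit boxes laid out along the x axis.
DEMO_NODES = [
    Node((0, 0, 0), (4, 1, 1), left=1, right=2),   # root
    Node((0, 0, 0), (1, 1, 1), prim=0),            # leaf
    Node((2, 0, 0), (4, 1, 1), left=3, right=4),   # inner
    Node((2, 0, 0), (3, 1, 1), prim=1),            # leaf
    Node((3, 0, 0), (4, 1, 1), prim=2),            # leaf
]
```

Each pass through the loop depends on the node fetched in the previous pass, so a single ray cannot start its next memory access until the current one returns. On a GPU, that wait has to be covered by work from other rays, which is the kind of latency the abstract refers to.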


Similar resources

Efficient GPU Screen-Space Ray Tracing

We present an efficient GPU solution for screen-space 3D ray tracing against a depth buffer by adapting the perspective-correct DDA line rasterization algorithm. Compared to linear ray marching, this ensures sampling at a contiguous set of pixels and no oversampling. This paper provides for the first time full implementation details of a method that has been proven in production of recent major...


GPU Rendering of Secondary Effects

In this paper we present an efficient data structure and algorithms for GPU ray tracing of secondary effects like reflections, refractions and shadows. Our method extends previous work on layered depth cubes in that it uses layered depth cubes as an adaptive space partitioning scheme for ray tracing. We propose a new method to efficiently build LDCs on the GPU using geometry shaders available i...


Fast Ray Sorting and Breadth-First Packet Traversal for GPU Ray Tracing

We present a novel approach to ray tracing execution on commodity graphics hardware using CUDA. We decompose a standard ray tracing algorithm into several data-parallel stages that are mapped efficiently to the massively parallel architecture of modern GPUs. These stages include: ray sorting into coherent packets, creation of frustums for packets, breadth-first frustum traversal through a bound...


Algorithm optimizations and mapping scheme for interactive ray tracing on a reconfigurable architecture

This paper presents a mapping scheme of an optimized octree-based ray tracing algorithm and its implementation on a SIMD reconfigurable architecture, MorphoSys, with appropriate hardware incorporated. A two-level SIMD mapping scheme for ray tracing is chosen to get a better trade-off between coherence exploitation efficiency and bandwidth requirements. We apply a SIMD octree traversal algorithm ...


GPU-Based Ray-Casting of Quadratic Surfaces

Quadratic surfaces are frequently used primitives in geometric modeling and scientific visualization, such as rendering of tensor fields, particles, and molecular structures. While high visual quality can be achieved using sophisticated ray tracing techniques, interactive applications typically use either coarsely tessellated polygonal approximations or pre-rendered depth sprites, thereby tradi...



Publication date: 2014